Data visualization is a key part of data science. We can use dataviz in two main ways:
- During the research process: quick sketches that highlight the internal patterns of the data and help us understand the phenomena.
- Reporting results: detailed visualizations that support the storytelling of our publication.
Also, there are two main types of visualizations:
- Static: for papers, posters, and other printed material
- Interactive: for computer-supported materials
So, we need a dataviz language that allows:
- Easy sketching
- Detail tailoring as needed
- Good static publication-level visualizations
- Good interactive visualizations.
We could also add:
- Easy to understand
- Reusable code
- With a wide range of options
- With pre-made thematic plots
- In my personal experience, the combination ggplot (Wickham 2016) + plotly (Sievert 2020) + shiny (Chang et al. 2021) covers all these conditions.
- This does not mean that it is the only or the best option for any of the above points, but it is the one that best handles the trade-off between all of them.
- For example, the best-known tool for interactive visualizations is neither R nor Python, but the D3 library for JavaScript.
GGPLOT
In ggplot we think of plots as a succession of layers, which are built one at a time.
The + operator allows us to add new layers to the plot.
The ggplot() command allows us to define the data source and the variables that will determine the axes of the plot (x, y), as well as the color and shape of the lines or points, etc. All the mapped attributes go inside aes().
The successive layers allow us to define:
- One or more types of graphics (geometries): geom_col(), geom_line(), geom_point(), geom_boxplot()
- Titles and axis names: labs()
- Plot styling: theme()
- Axis scales: scale_y_continuous(), scale_x_discrete()
- Faceting: facet_wrap(), facet_grid()
For example, we can make a quick sketch in two lines of code:
library(tidyverse)
library(palmerpenguins)
ggplot(penguins, aes(x = flipper_length_mm, y = body_mass_g, color = sex)) +
  geom_point()
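Because each layer is added with +, a plot can also be stored in a variable and extended step by step. The sketch below is only an illustration of that idea: it assumes ggplot2 and palmerpenguins are installed, and the variable names (base, sketch, smoothed) are hypothetical.

```r
library(ggplot2)
library(palmerpenguins)

# Start with the data and the aesthetic mappings only: no geometry yet
base <- ggplot(penguins, aes(x = flipper_length_mm, y = body_mass_g, color = sex))

# Add a first layer with the + operator
sketch <- base + geom_point()

# Add a second layer on top of the first one (a linear fit per group)
smoothed <- sketch + geom_smooth(method = "lm", se = FALSE)

# Each + adds one entry to the list of layers stored in the plot object
length(base$layers)      # 0
length(sketch$layers)    # 1
length(smoothed$layers)  # 2
```

Since every intermediate object is itself a valid plot, the same base can be reused to try out different geometries.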

But we can also add as much detail as we want:
plt <- penguins %>%
  filter(!is.na(sex)) %>%
  ggplot(., aes(x = flipper_length_mm,
                y = body_mass_g,
                color = sex)) +
  geom_point() +
  facet_wrap(~species) +
  theme_minimal() +
  scale_color_manual(values = c("darkorange", "cyan4")) +
  labs(title = "Penguin flipper and body mass",
       subtitle = "Dimensions for male and female Adelie, Chinstrap and Gentoo Penguins at Palmer Station LTER",
       x = "Flipper length (mm)",
       y = "Body mass (g)",
       color = "Penguin sex",
       shape = 'Island') +
  theme(legend.position = "bottom",
        legend.background = element_rect(fill = "white", color = NA),
        plot.title.position = "plot",
        plot.caption = element_text(hjust = 0, face = "italic"),
        plot.caption.position = "plot")
plt

- The pre-defined themes, theme_*(), allow us to quickly adjust the style of the plot, while theme() lets us fine-tune every detail.
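As a small illustration of this division of labor, the sketch below applies a complete pre-defined theme first and then overrides a few individual elements with theme(). The particular theme and the elements changed are arbitrary choices for the example, not recommendations.

```r
library(ggplot2)
library(palmerpenguins)

p <- ggplot(penguins, aes(x = flipper_length_mm, y = body_mass_g, color = species)) +
  geom_point()

# A complete, pre-defined theme changes the overall look in one call
p_bw <- p + theme_bw()

# theme() then overrides individual elements on top of the complete theme
p_custom <- p_bw +
  theme(legend.position = "bottom",
        panel.grid.minor = element_blank(),
        plot.title = element_text(face = "bold"))

p_custom
```

The order matters: a complete theme added after theme() would replace the individual adjustments, so the theme_*() call usually comes first.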
Interactivity
- Making this an interactive plot takes only one extra line of code:
ggplotly(plt)
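The conversion can also be tuned a little. The following self-contained sketch assumes plotly is installed and uses the tooltip argument of ggplotly() to limit what is shown on hover; the object names and the output file name are placeholders.

```r
library(ggplot2)
library(plotly)
library(palmerpenguins)

static_plot <- ggplot(penguins,
                      aes(x = flipper_length_mm, y = body_mass_g, color = species)) +
  geom_point()

# Convert the ggplot object into an interactive plotly widget,
# showing only the x and y values in the hover tooltip
interactive_plot <- ggplotly(static_plot, tooltip = c("x", "y"))

# Print the widget to display it in a notebook or in the RStudio viewer
interactive_plot

# The widget can also be saved as a stand-alone HTML file
# (the file name is a placeholder)
# htmlwidgets::saveWidget(interactive_plot, "penguins_interactive.html")
```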
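ggplot extensions
The ggplot library in turn has many other libraries that extend its potential. Among my favorites are:
- gganimate: for animated plots
- ggridges: for faceted density plots
- GGally: for grids of plots and specific visualizations
- treemapify: for treemaps
For example: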
library(GGally)
penguins %>%
  select(species:bill_depth_mm) %>%
  ggpairs(mapping = aes(color = species))

library(ggridges)
ggplot(penguins, aes(x = bill_length_mm, y = species, fill = species)) +
  geom_density_ridges()

library(corrr)
penguins %>%
  select(bill_length_mm:body_mass_g) %>%
  correlate(.) %>%
  network_plot(.)
Correlation method: 'pearson'
Missing treated using: 'pairwise.complete.obs'

- This makes very specific plot types a one-liner, without losing the ability to add our own details later (it is not a black box).
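To illustrate the "not a black box" point, here is a minimal sketch that starts from the one-line ridge plot and then adds custom details with ordinary ggplot layers. It assumes ggridges is installed; the colors, labels and object names are arbitrary choices for the example.

```r
library(ggplot2)
library(ggridges)
library(palmerpenguins)

# The extension provides the specialised geometry in one line ...
ridges <- ggplot(penguins, aes(x = bill_length_mm, y = species, fill = species)) +
  geom_density_ridges(alpha = 0.7)

# ... and the result is a regular ggplot object that accepts further layers
ridges_detailed <- ridges +
  scale_fill_manual(values = c("darkorange", "purple", "cyan4")) +
  labs(x = "Bill length (mm)", y = NULL, title = "Bill length by species") +
  theme_minimal() +
  theme(legend.position = "none")

ridges_detailed
```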
Shiny
Shiny is a package for building interactive dashboards, where the user sets the parameters of the plots.
For me, this is a great tool in two workflows:
- As supporting material for a paper, to increase reader engagement, e.g. https://ldaglobaltrade.uni.lu/dashboard/
- During project development, to communicate intermediate results to the team, e.g. https://sciencebias.uni.lu/dev/rg_app/ (user: 'tmp', passwd: 'user', please do not disclose)
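A dashboard of this kind can be quite small. The sketch below is a hypothetical minimal app, unrelated to the dashboards linked above, in which the user picks a penguin species and the plot is redrawn accordingly. It assumes shiny, ggplot2 and palmerpenguins are installed; the input and output ids ("species", "scatter") are made up for the example.

```r
library(shiny)
library(ggplot2)
library(palmerpenguins)

# User interface: one control for the plot parameter and one plot output
ui <- fluidPage(
  titlePanel("Penguin explorer"),
  selectInput("species", "Species",
              choices = levels(penguins$species)),
  plotOutput("scatter")
)

# Server: redraw the plot whenever the selected species changes
server <- function(input, output) {
  output$scatter <- renderPlot({
    selected <- penguins[penguins$species == input$species, ]
    ggplot(selected, aes(x = flipper_length_mm, y = body_mass_g, color = sex)) +
      geom_point() +
      labs(x = "Flipper length (mm)", y = "Body mass (g)")
  })
}

# Build the app object; launch it locally with runApp(app)
app <- shinyApp(ui = ui, server = server)
```

The plot code inside renderPlot() is the same ggplot code used for static figures, which is what makes it easy to move from a sketch to a dashboard.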
Chang, Winston, Joe Cheng, JJ Allaire, Carson Sievert, Barret Schloerke, Yihui Xie, Jeff Allen, Jonathan McPherson, Alan Dipert, and Barbara Borges. 2021.
shiny: Web Application Framework for R.
https://CRAN.R-project.org/package=shiny.
Sievert, Carson. 2020.
Interactive Web-Based Data Visualization with R, plotly, and shiny. Chapman and Hall/CRC.
https://plotly-r.com.
Wickham, Hadley. 2016.
ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York.
https://ggplot2.tidyverse.org.